
Review for NeurIPS paper: Efficient Planning in Large MDPs with Weak Linear Function Approximation

Neural Information Processing Systems

All reviewers agree that the paper makes a nice contribution to planning with function approximation. In particular, the paper considers an important open problem, and while the problem is solved by making a few assumptions (most notably the core-states assumption), the results constitute significant progress on this important problem. The reviewers also appreciate the precise language and careful discussion of related work. Among the remaining concerns, R2 wants to see some evidence of robustness against failure of the "core state" assumption. While empirical experiments may not fit the theoretical nature of the paper, the authors could consider a theoretical justification: namely, define a notion of error that measures how much the core-states assumption is violated, and show how such an error manifests itself in the final guarantee.
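One way the reviewers' suggestion could be formalized (the notation below is assumed for illustration, not taken from the paper): measure how far any state's features fall outside the set representable by the core states' features.

```latex
% Hypothetical violation measure (notation assumed, not from the paper).
% \varphi(s): feature vector of state s; \Phi_C: matrix whose rows are the
% core states' features; \mathcal{W}: the allowed weight vectors (e.g. the
% simplex, if core features are meant to cover states by convex combination).
\varepsilon_{\mathrm{core}}
  \;=\; \max_{s}\; \min_{w \in \mathcal{W}}\;
  \bigl\| \varphi(s) - \Phi_C^{\top} w \bigr\|_{\infty}
```

A robustness result in the suggested style would then bound the extra suboptimality of the planner in terms of $\varepsilon_{\mathrm{core}}$, recovering the original guarantee when $\varepsilon_{\mathrm{core}} = 0$.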


Goal-oriented inference of environment from redundant observations

Takahashi, Kazuki, Fukai, Tomoki, Sakai, Yutaka, Takekawa, Takashi

arXiv.org Artificial Intelligence

An agent learns to organize its decision behavior to achieve a behavioral goal, such as reward maximization, and reinforcement learning is often used for this optimization. Learning an optimal behavioral strategy is difficult when the events necessary for learning are only partially observable, a setting known as a Partially Observable Markov Decision Process (POMDP). Moreover, real-world environments also produce many events that are irrelevant to reward delivery and to an optimal behavioral strategy. Conventional POMDP methods, which attempt to infer transition rules among all observations, including irrelevant states, are ineffective in such environments. Assuming a Redundantly Observable Markov Decision Process (ROMDP), we propose a method for goal-oriented reinforcement learning that efficiently learns state transition rules among reward-related "core states" from redundant observations. Starting with a small number of initial core states, our model gradually adds new core states to the transition diagram until it achieves an optimal behavioral strategy consistent with the Bellman equation. We demonstrate that the resulting inference model outperforms conventional POMDP methods. We emphasize that our model, which contains only the core states, is highly explainable. Furthermore, the proposed method suits online learning, as it suppresses memory consumption and improves learning speed.
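The incremental core-state idea from the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the environment (a 5-state chain with an irrelevant coin-flip appended to each observation), the aggregation of unknown observations into a single "other" state, and the promotion rule (grow the core set while the aggregate's average Bellman error stays large) are all invented here for illustration.

```python
import random

random.seed(0)

# Toy chain MDP: true states 0..4, reward 1.0 on reaching state 4.
N, ACTIONS, GAMMA, ALPHA, EPS = 5, (0, 1), 0.9, 0.3, 0.2  # 0 = left, 1 = right

def step(s, a):
    s2 = min(max(s + (1 if a else -1), 0), N - 1)
    return s2, float(s2 == N - 1)

def observe(s):
    # Redundant observation: the true state plus an irrelevant coin flip.
    return (s, random.randint(0, 1))

def train(n_batches=30, episodes=20, tol=0.05):
    core, q, visits = set(), {}, {}
    key = lambda o: o if o in core else "other"  # aggregate unknown observations

    def act(k):
        # Greedy action with random tie-breaking.
        vals = [q.get((k, a), 0.0) for a in ACTIONS]
        best = max(vals)
        return random.choice([a for a, v in zip(ACTIONS, vals) if v == best])

    for _ in range(n_batches):
        errs = []                              # Bellman errors at the aggregate
        for _ in range(episodes):
            s, o = 0, observe(0)
            for _ in range(3 * N):
                a = act(key(o)) if random.random() > EPS else random.choice(ACTIONS)
                s2, r = step(s, a)
                o2 = observe(s2)
                target = r + GAMMA * max(q.get((key(o2), b), 0.0) for b in ACTIONS)
                err = target - q.get((key(o), a), 0.0)
                q[(key(o), a)] = q.get((key(o), a), 0.0) + ALPHA * err
                if key(o) == "other":
                    errs.append(abs(err))
                    visits[o] = visits.get(o, 0) + 1
                s, o = s2, o2
                if r:
                    break
        # Grow the transition diagram: while the aggregated "other" state keeps
        # a large average Bellman error, promote its most-visited observation.
        if errs and sum(errs) / len(errs) > tol and visits:
            core.add(max(visits, key=visits.get))
            visits.clear()
    return core, q

core, q = train()
```

Because the aggregate aliases states with different values, its Bellman residual cannot vanish, so the sketch keeps promoting observations until the remaining aggregate is benign; the paper's method additionally exploits reward-relatedness and the redundancy structure of the ROMDP.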


Efficient Planning in Large MDPs with Weak Linear Function Approximation

Shariff, Roshan, Szepesvári, Csaba

arXiv.org Machine Learning

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak requirements: low approximation error for the optimal value function, and a small set of "core" states whose features span those of other states. In particular, we make no assumptions about the representability of policies or value functions of non-optimal policies. Our algorithm produces almost-optimal actions for any state using a generative oracle (simulator) for the MDP, while its computation time scales polynomially with the number of features, core states, and actions, and with the effective horizon.
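A hedged sketch of the core-states idea in the abstract, not the authors' algorithm (which works from a generative oracle rather than known dynamics): when every state's feature vector is a combination of the core states' features, value-iteration backups need only be computed at the core states, and the fitted linear value function extends to all other states. The toy MDP, features, and core-state choice below are invented for illustration.

```python
import numpy as np

# Toy 6-state, 2-action MDP with known dynamics (standing in for the
# generative oracle) and hand-picked 2-d features.
nS, nA, GAMMA = 6, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] = next-state probs
R = rng.uniform(size=(nS, nA))                  # rewards in [0, 1)
Phi = np.stack([np.ones(nS), np.arange(nS) / (nS - 1)], axis=1)  # features

# Core states: every other state's features are convex combinations of these.
core = [0, nS - 1]
Phi_core = Phi[core]

# Approximate value iteration carried out only on the core states: back up
# Q at the core states, fit theta so Phi_core @ theta matches the backed-up
# values, and read off Phi @ theta for every other state.
theta = np.zeros(2)
for _ in range(200):
    v = Phi @ theta                              # current values, all states
    q_core = R[core] + GAMMA * P[core] @ v       # Bellman backups, core only
    theta, *_ = np.linalg.lstsq(Phi_core, q_core.max(axis=1), rcond=None)

# Planning for an arbitrary state: one-step lookahead against Phi @ theta.
greedy = (R + GAMMA * P @ (Phi @ theta)).argmax(axis=1)
```

The per-iteration cost depends on the number of features, core states, and actions, never on `nS` except through the toy's explicit dynamics; with a simulator, the `P[core] @ v` backups would be replaced by sampled estimates. The convex-combination property of the core features is what keeps the extrapolated backup a contraction in this sketch.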